Published online 29 May 2014 



Nucleic Acids Research, 2014, Vol. 42, No. 12 7591-7610 

doi: 10.1093/nar/gku451



SPIB and BATF provide alternate determinants of IRF4 
occupancy in diffuse large B-cell lymphoma linked to 
disease heterogeneity 

Matthew A. Care 1,2,†, Mario Cocco 1,†, Jon P. Laye 1, Nicholas Barnes 1, Yuanxue Huang 3,
Ming Wang 3, Sharon Barrans 4, Ming Du 3, Andrew Jack 4, David R. Westhead 2, Gina
M. Doody 1 and Reuben M. Tooze 1,4,*

1 Section of Experimental Haematology, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK, 
2 Bioinformatics Group, School of Molecular and Cellular Biology, University of Leeds, Leeds, UK, 3 Division of 
Molecular Histopathology, Department of Pathology, University of Cambridge, Cambridge, UK and 4 Haematological 
Malignancy Diagnostic Service, Leeds Cancer Centre, Leeds Teaching Hospitals NHS Trust, Leeds, UK 

Received September 13, 2013; Revised May 06, 2014; Accepted May 8, 2014 



ABSTRACT 

Interferon regulatory factor 4 (IRF4) is central to the 
transcriptional network of activated B-cell-like dif- 
fuse large B-cell lymphoma (ABC-DLBCL), an ag- 
gressive lymphoma subgroup defined by gene ex- 
pression profiling. Since cofactor association modi- 
fies transcriptional regulatory input by IRF4, we as- 
sessed genome occupancy by IRF4 and endogenous 
cofactors in ABC-DLBCL cell lines. IRF4 partners 
with SPIB, PU.1 and BATF genome-wide, but SPIB 
provides the dominant IRF4 partner in this context. 
Upon SPIB knockdown IRF4 occupancy is depleted 
and neither PU.1 nor BATF acutely compensates. In- 
tegration with ENCODE data from lymphoblastoid 
cell line GM12878, demonstrates that IRF4 adopts 
either SPIB- or BATF-centric genome-wide distributions
in related states of post-germinal centre B-cell
transformation. In primary DLBCL high-SPIB and low-BATF
or the reciprocal low-SPIB and high-BATF mRNA
expression links to differential gene
expression profiles across nine data sets, identify- 
ing distinct associations with SPIB occupancy, sig- 
natures of B-cell differentiation stage and poten- 
tial pathogenetic mechanisms. In a population-based 
patient cohort, SPIB^high/BATF^low-ABC-DLBCL is enriched
for mutation of MYD88, and SPIB^high/BATF^low-ABC-DLBCL
with MYD88-L265P mutation identifies a
small subgroup of patients among this otherwise ag- 
gressive disease subgroup with distinct favourable 
outcome. We conclude that differential expression 
of IRF4 cofactors SPIB and BATF identifies biologically
and clinically significant heterogeneity among
ABC-DLBCL.



INTRODUCTION 

Classification based on gene expression has linked clini- 
cal response to molecular biology in diffuse large B-cell 
lymphoma (DLBCL), an aggressive and common form 
of human lymphoma. The cell of origin classification has 
become a prevailing paradigm, and divides DLBCL into 
groups based on their relationship to normal B-cell coun- 
terparts: the germinal centre B-cell (GCB) and the acti- 
vated B-cell type (ABC) (1). ABC-DLBCL has a worse 
prognosis on the currently standard immunochemotherapy regimen
R-CHOP (rituximab, cyclophosphamide, hydroxydaunorubicin,
Oncovin, prednisolone) and is related to the
continuum of activated cell states that lie between B-cells 
and plasma cells. This continuum is linked to a reorganiz- 
ing transcriptional network driven by changes in expression 
of core transcriptional regulators. We reasoned that varia- 
tion in expression of these transcriptional regulators might 
equally contribute to heterogeneity within ABC-DLBCL. 

Interferon regulatory factor 4 (IRF4) is a defining fea- 
ture of ABC-DLBCL and in normal B-cells is essential for 
the initiation of plasma cell differentiation (2-5). The DNA- 
binding domain of IRF4 is restricted via an autoinhibitory 
interaction (6), and release depends primarily on binding 
to transcription factor partners. Two principal cofactors
of IRF4 are the ETS-family proteins PU.1 and SPIB, at 
ETS/IRF Composite Elements (EICE) (7-9). While highly 
related, SPIB and PU.1 are only partially redundant and 
are essential for mature B-cell survival (10-12). SPIB can 
additionally act to prevent plasma cell differentiation by repressing
PRDM1/BLIMP1 (13). In ABC-DLBCL SPIB is
of particular relevance as this gene can be subject to deregulation
by amplification or translocation leading to heterogeneity
in SPIB expression (14,15). A recent study reported
on the role of SPIB in ABC-DLBCL using biotin-tagged 
SPIB for ChIP-seq assays, and concluded that SPIB/IRF4
heterodimers were central to ABC-DLBCL pathogenesis,
potentially regulating B-cell receptor signalling pathways
and interferon-α (IFNα) secretion downstream of MYD88
mutations (16). However, the contribution of endogenous 
partners to regulatory element usage by IRF4 was not di- 
rectly assessed. 

BATF, an AP1-family protein (17), was recently described
as a principal IRF4 partner at AP1/IRF Composite
Elements (AICEs) in T-cells and dendritic cells (18-21).
This partnership was also observed in cytokine stimulated 
B-cells (19,20). BATF plays an essential role in both T- and 
B-cells during humoral immune responses, with a require- 
ment in the germinal centre and the regulation of class- 
switch recombination via AICDA (22,23). However, in the 
context of B-cell malignancy BATF is consistently associ- 
ated with ABC-DLBCL, representative of a post-germinal 
centre state, rather than GCB-DLBCL (24); BATF thus 
provides a potential alternate partner for IRF4 in this con- 
text. 

Here, we have addressed the relationship between IRF4 
and its endogenous partners in ABC-DLBCL. Our results 
demonstrate that SPIB does provide the functionally dominant
IRF4 partner in ABC-DLBCL with SPIB deregulation;
however, in this context BATF provides an alternative
IRF4 partner genome-wide. We find that in primary ABC- 
DLBCL, variation in the expression of SPIB and BATF is 
associated with clinical and biological heterogeneity. Strong 
expression of SPIB relative to BATF is linked with bet- 
ter overall survival, MYD88 mutations and expression of 
genes associated with SPIB occupancy and B-cell rather 
than plasmablast or plasma cell state. 

MATERIALS AND METHODS 

Antibodies and primers 

Antibodies used were: IRF4 antibody (sc-28696X), PU.1
antibody (sc-352X), BATF antibody (sc-100974X, Santa
Cruz), BLIMP1 polyclonal antibody as described (25),
monoclonal anti-β-ACTIN (clone AC-15, Sigma), rabbit
anti-mouse immunoglobulin G (IgG; Jackson ImmunoResearch),
control rabbit IgG (Upstate Biotechnology), control
mouse IgG (Sigma), anti-MYC clone 9E10.

Vectors and antibody generation 

Coding sequence for human SPIB amino acids 1-51 was cloned into
pGEX6P1 between EcoRI and BglII and sequence verified;
for primers see Supplementary Methods. GST-fusion
proteins were expressed according to manufacturer's in- 
structions (Amersham) and used to generate rabbit poly- 
clonal antisera according to standard procedures (Harlan 
Seralab). 

Myc-epitope tagged coding sequence for human SPIB, 
SPI1 (PU.1) and IRF4 were cloned into pIRES2EGFP
(Clontech) between EcoRI and BglII for SPIB, EcoRI and
BamHI for SPI1/PU.1 and IRF4, and sequence verified. 



Cell lines, culture, transfection and knockdown 

H929, HeLa and COS cells were cultured in RPMI1640 
media and OCI-LY3, OCI-LY10 (kind gift of Prof. R.E. 
Davis) in Iscove's Modified Dulbecco's Medium (IMDM) 
with GlutaMAX™ (Life Technologies™), each contain- 
ing 10% heat inactivated fetal calf serum. COS and HeLa 
cells were transfected with GeneJuice (Novagen) according 
to the manufacturer's instructions. 

Western blot, ChIP and Electrophoretic Mobility Shift Assay 

Western blots were performed according to standard pro- 
cedures. ChIP and electrophoretic mobility shift assay 
(EMSA) were performed as described (26). For BATF the
ChIP method was adapted such that protein A Sepharose 
(Thermo Scientific) was first saturated with rabbit anti- 
mouse secondary antibody, and then incubated with anti- 
BATF or control mouse IgG. Pre-bound beads were used 
to immunoprecipitate chromatin fractions. Nuclear extracts
for EMSA were prepared from transfected COS cells, and 
OCI-LY3 and -LY10 cell lines. For EMSA probe sequences 
and ChIP PCR primer sequences see Supplementary Meth- 
ods. 

Library preparation and sequencing 

Library preparation for input chromatin, IRF4, SPIB and 
PU.1 was performed using the Illumina ChIP-seq Sample
Prep Kit (Illumina®) according to manufacturer's instruc- 
tions, and run on a GAIIx Genome Analyser (Illumina®, 
Little Chesterford, UK). BATF samples, and libraries 
generated for control siRNA and SPIB siRNA treated 
chromatin were prepared using the MicroPlex Library 
Preparation™ kit (Diagenode) for ChIP, size selected us- 
ing AMPure XP beads (Beckman Coulter) and run on an 
Illumina HiSeq 2500.

siRNA knockdown, RNA extraction and gene expression 
analysis 

OCI-LY3 and OCI-LY10 cells were transfected 
with SPIB siRNA (s13354; 4392420, Ambion Life
Technologies™) or control non-targeting siRNA (27)
using Amaxa® Nucleofector® system and Amaxa® Cell 
line Nucleofector® Kit V (Lonza), setting D.023, according 
to manufacturer's instructions. RNA was extracted with 
TRIzol® and amplified using Illumina® TotalPrep™-96 
RNA Amplification Kit (Life Technologies™). Resulting 
cRNAs were then hybridized to BeadChips using the 
HumanHT-12 v4 Expression BeadChip Kit according to 
manufacturer's instructions, and the BeadChips scanned 
with the Illumina BeadArray Reader (Illumina®). Analysis 
was performed as previously described (28). 

MYD88 mutation screening 

Primary DLBCL samples related to GSE32918 were re- 
trieved from the Haematological Malignancy Diagnostic 
Service of Leeds Teaching Hospital NHS Trust. DNA from 
representative sections was extracted using standard proteinase
K digestion and the QIAamp DNA Micro Kit (QIAGEN,
Crawley, UK). MYD88 gene mutation was screened
by PCR and Sanger sequencing. The primer sequences and
PCR conditions are detailed in Supplementary Methods. 
PCR products were sequenced using the BigDye Termina- 
tor v3.1 (Applied Biosystems, Foster City, CA, USA). Se- 
quence changes were confirmed by at least two indepen- 
dent PCR and sequencing experiments. The somatic muta- 
tion was ascertained by excluding germline changes through 
database search and analysis of DNA from microdissected 
normal cells. 

Data sets and analysis 

A set of 10 DLBCL data sets were used as previ- 
ously described (24), including data sets derived from 
the Gene Expression Omnibus: GSE32918, GSE10846, 
GSE12195, GSE19246, GSE22470, GSE22895, GSE31312, 
GSE34171, GSE4475, (29-37) as well as the data of 
Monti et al (10), and Wright et al (http://llmpp.nih.gov/DLBCLpredictor/)
(38). The data set, GSE10846, was split
into treatment groups (CHOP/R-CHOP) yielding two data 
sets that were then analysed independently (referred to 
as GSE10846_CHOP and GSE10846_R-CHOP). Data set 
GSE41208 covering progressive changes in gene expression 
during plasma cell differentiation was previously described 
(28). Data generated for this manuscript are available via 
GSE50015 and GSE56857. 

ChIP-seq data analysis and motif detection

For more detail see Supplementary Methods. Trimmed 
reads were aligned with Bowtie2 (39), and analysed for 
peaks using GEM (40). Peak overlaps were determined using
a clustering approach such that any peak centre <250
bp from an index peak centre was considered part of an
overlapping cluster. De novo motif detection was performed
with HOMER (41). Displayed motifs are provided as ma- 
trices in Supplementary Table S2. The Broad IGV tool was 
used to display ChIP-seq data (42,43).
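
The peak-overlap rule described above (peak centres less than 250 bp from an index peak centre belong to one cluster) can be illustrated with a minimal Python sketch. The function name, the tuple layout and the choice of taking the left-most unassigned peak on each chromosome as the index peak are assumptions made for illustration only; the procedure actually used is described in Supplementary Methods.

```python
from collections import defaultdict


def cluster_peaks(peaks, max_dist=250):
    """Group peak centres lying < max_dist bp from an index peak centre.

    peaks: iterable of (factor, chrom, centre) tuples.
    The left-most unassigned peak on each chromosome is used as the
    index peak (an illustrative assumption, not the published method).
    """
    by_chrom = defaultdict(list)
    for factor, chrom, centre in peaks:
        by_chrom[chrom].append((centre, factor))

    clusters = []
    for chrom in sorted(by_chrom):
        current = None
        for centre, factor in sorted(by_chrom[chrom]):
            if current is not None and centre - current["index"] < max_dist:
                # Within 250 bp of the index peak: same cluster.
                current["factors"].add(factor)
            else:
                # Start a new cluster with this peak as the index peak.
                current = {"chrom": chrom, "index": centre, "factors": {factor}}
                clusters.append(current)
    return clusters


# Hypothetical peak centres, for illustration only.
example = [
    ("IRF4", "chr1", 1000),
    ("SPIB", "chr1", 1040),
    ("PU.1", "chr1", 1300),
    ("SPIB", "chr2", 500),
]
clusters = cluster_peaks(example)
# Label each cluster with the underscore-separated factor names.
labels = ["_".join(sorted(c["factors"])) for c in clusters]
print(labels)
```

Each cluster label produced this way corresponds to a co-occupancy pattern of the kind used later in the text (for example IRF4_SPIB).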

RESULTS 

SPIB, PU.1 and IRF4 cis-regulatory occupancy in ABC-DLBCL cell lines

Assessment of endogenous SPIB is essential in order to 
understand how distinct cooperating factors contribute 
to IRF4 regulatory element usage in ABC-DLBCL. We 
therefore raised a polyclonal antibody against the variable 
amino-terminus of the protein, which did not cross-react
with PU.1 or SPIC, and was validated in conventional ChIP
assays (Supplementary Figure S1A-C). We then performed
ChIP-seq for SPIB, PU.1 and IRF4 from the ABC-DLBCL
cell lines OCI-LY3 and OCI-LY10. We identified 6379 and
13184 IRF4 sites, 2937 and 8904 PU.1 sites and 21055 and
14234 SPIB sites in OCI-LY3 and LY10, respectively (Supplementary
Table S1).

Occupancy of SPIB was confirmed at known targets 
including BCL2A1, P2RY10, FCRL5, CD36, CD37 and 
CD40 (Figure 1A and Supplementary Figure S2 and Table
S1) (16,44-46). SPIB occupancy was identified at promoters
of genes previously defined as PU.1 targets, such as MS4A1
(CD20) (47), a critical element of the B-cell phenotype and
target of therapeutic monoclonal antibody rituximab, and
at promoters of several members of dispersed gene fami- 
lies, such as TLR4, TLR7 and TLR9. SPIB binding was 
also identified across clustered gene families, such as SP100, 
SP110 and SP140, and the FCRL1-5 cluster. Important 
regulatory interactions for SPIB have been previously de- 
fined, first, in a positive feedback loop with the transcrip- 
tional factor TCF4 (E2-2) during plasmacytoid dendritic 
cell ontogeny (48,49), and, second, in a negative feedback 
loop with PRDM1 (BLIMP1) during plasma cell differentiation
(13). Consistent with these regulatory interactions,
binding of SPIB to the previously identified PU.1/ETS-site
within the PRDM1 promoter (50) and to several
elements within the TCF4 gene was identified (Supplementary
Figure S2). Thus, the overall pattern of occupancy detected
for SPIB confirms known regulatory interactions,
and provides to our knowledge the first genome-wide view 
of SPIB occupancy for the endogenous protein. 

The cistromes of all three assessed transcription factors 
overlapped extensively between the two cell lines. Differences
primarily derived from absolute numbers of binding
events in each cell type; thus, for IRF4 we observed 95%
overlap of the LY3 cistrome among that of LY10, for PU.1
97% overlap of the LY3 cistrome among that of LY10 and
for SPIB 91% overlap of the LY10 cistrome among that
of LY3. As expected the cistromes of these factors were
also highly interrelated within each cell line (Figure 1B).
IRF4 occupancy occurred predominantly in the context of 
one or other ETS-partner, encompassing 87% of the IRF4 
cistrome in OCI-LY3 and 61% in OCI-LY10. More than 
95% of these sites were bound in the presence of SPIB in 
either cell line. Among sites occupied by ETS-factors without
IRF4, PU.1 alone made a minor contribution. In contrast,
occupancy by SPIB in the absence of PU.1 was a common
feature (SPIB_Only 90% in LY3 and 50% in LY10;
factor names separated by underscores are used to denote
co-occupancy patterns in the remainder of the manuscript,
e.g. IRF4 and SPIB co-occupancy = IRF4_SPIB). At co-occupied
sites the peak centres showed a high degree of
overlap and for the majority of IRF4 occupied sites the
nearest peak centre for either SPIB or PU.1 was within 50 bp
(Figure 1C). IRF4 occupancy in the absence of ETS-factors 
showed the largest bias towards promoters in both cell lines 
(34%), while other factor combinations showed lesser pro- 
portions of promoter occupancy (12.5-23%). 

For comparison we additionally assessed the H929 
myeloma cell line, which expresses high levels of PU.1 relative
to SPIB (Supplementary Figure S1C). ChIP-seq from
this cell line provides both an assessment of IRF4 occupancy
in a distinct transcription factor context, and an additional
control for the specificity of the SPIB antibody
since the strong expression of PU.1 relative to SPIB would
be expected to result in a reversal of the cistrome sizes in
comparison to the two DLBCL cell lines. In the H929 cell
line we identified 21946 PU.1, and 19755 IRF4 sites (Supplementary
Figure S3A and B). In contrast, only 1193 SPIB
sites were recovered which is consistent with the low level
of protein expression. Of the IRF4 cistrome in H929 cells,
63.5% was occupied by IRF4 in the absence of PU.1 or
SPIB, while 32.7% was occupied by IRF4 and PU.1. This
represented a significant shift in favour of IRF4 occupancy
in the absence of an ETS-factor partner in H929 relative to
OCI-LY3 and LY10 (p-value = 2.2E-16, chi-squared test).
SPIB alone or in conjunction with PU.1 made only a minor
contribution to IRF4 occupancy in H929 myeloma cells.

Figure 1. IRF4, SPIB and PU.1 distribution in ABC-DLBCL. (A) Representative examples of occupancy patterns for IRF4, PU.1 and SPIB in OCI-LY3
and OCI-LY10 cells are shown, normalized read-counts/million are indicated to the left of each track. (B) Venn diagrams showing the overlap of
transcription factor cistromes for the indicated cell lines. (C) Density plots of the distribution of peak centres for SPIB (blue) and PU.1 (orange) relative
to IRF4 peak centres. The x-axis shows 500 bp up- and down-stream of IRF4 peak centres at 0.
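
The chi-squared comparison of IRF4_Only occupancy in H929 versus the ABC-DLBCL cell lines can be sketched in Python as a 2x2 contingency test. This is an illustration under stated assumptions rather than the authors' calculation: the H929 IRF4_Only count is approximated from the reported 63.5% of 19755 sites, the OCI-LY3 and OCI-LY10 counts (839 and 5164 IRF4_Only peaks of 6379 and 13184 IRF4 sites) are pooled, and no continuity correction is applied.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and
    p-value for a 2x2 table [[a, b], [c, d]] with 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-squared with 1 d.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# H929: approximate count derived from the reported percentage (assumption).
h929_irf4_total = 19755
h929_irf4_only = round(0.635 * h929_irf4_total)

# OCI-LY3 and OCI-LY10 pooled (pooling is an assumption of this sketch).
abc_irf4_total = 6379 + 13184
abc_irf4_only = 839 + 5164

stat, p_value = chi_squared_2x2(
    h929_irf4_only, h929_irf4_total - h929_irf4_only,
    abc_irf4_only, abc_irf4_total - abc_irf4_only,
)
print(f"chi-squared = {stat:.1f}, p = {p_value:.3g}")
```

With counts of this size the p-value falls below the smallest value that double-precision arithmetic can represent, which is why statistical software typically reports such results as a floor value rather than an exact number.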

Occupancy confirms motif identity for SPIB and PU.1, and
SPIB as predominant IRF4 partner at EICEs

PU.1 and SPIB have at most subtle differences in preferred
binding motif when assayed in vitro (8,51). Consistent with
this, de novo motifs identified in regions occupied by SPIB
alone or in combination with PU.1 matched the previously
defined consensus (Figure 2A and Supplementary Table S2)
(52). Equally, EICEs were recovered from sites occupied
by IRF4 and SPIB or PU.1 (Figure 2B) and at sites occupied
by IRF4 and PU.1 in the H929 myeloma (Supplementary
Figure S3C). The detection by de novo motif discovery
of highly enriched motifs matching the in vitro determined
consensus for SPIB, and the canonical EICE at co-occupied
sites, provides further validation of the observed occupancy
patterns. At IRF4_SPIB_PU.1 occupied sites a modestly increased
frequency of additional EICE and ETS motifs (not
overlapping with an adjacent EICE) was evident within 100
bp of the peak centre, suggesting that some of these regions
have the potential for combined occupancy by all three factors
(Supplementary Figure S4). We conclude that in these
ABC-DLBCL cell lines SPIB is the principal IRF4 partner
at regulatory regions encompassing EICEs.

SPIB regulates immune response genes 

We next considered the relationship between local occupancy
by SPIB, PU.1 and IRF4 (defined as a peak less than
5 kb upstream or within a gene body) and gene expression in 
each cell line (Figure 3A). The bulk of genes associated with 
local factor occupancy had low median expression values, 
but the distributions were as expected shifted towards pos- 
itive gene expression in each cell line. Overall IRF4 associ- 
ated genes showed the greatest shift towards higher median 
gene expression. 
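
The definition of local occupancy used here (a peak less than 5 kb upstream of a gene or within the gene body) can be expressed as a short Python sketch. The function names, the gene names and all coordinates below are hypothetical and serve only to illustrate the strand-aware rule; they are not taken from the analysis pipeline.

```python
def peak_is_local(peak_pos, gene_start, gene_end, strand, upstream=5000):
    """True if the peak lies < upstream bp 5' of the gene or in the gene body.

    gene_start < gene_end are genomic coordinates; the transcription start
    site is gene_start on the '+' strand and gene_end on the '-' strand.
    """
    if strand == "+":
        return gene_start - upstream < peak_pos <= gene_end
    return gene_start <= peak_pos < gene_end + upstream


def genes_with_local_occupancy(peaks, genes, upstream=5000):
    """peaks: list of (chrom, position).
    genes: dict mapping name -> (chrom, start, end, strand)."""
    hits = set()
    for name, (g_chrom, start, end, strand) in genes.items():
        for p_chrom, pos in peaks:
            if p_chrom == g_chrom and peak_is_local(pos, start, end, strand, upstream):
                hits.add(name)
                break
    return hits


# Hypothetical genes and peak positions, for illustration only.
genes = {
    "GENE_A": ("chr1", 10_000, 20_000, "+"),
    "GENE_B": ("chr1", 30_000, 40_000, "-"),
    "GENE_C": ("chr2", 10_000, 20_000, "+"),
}
peaks = [("chr1", 6_000), ("chr1", 44_000), ("chr2", 4_000)]
local = genes_with_local_occupancy(peaks, genes)
print(sorted(local))
```

The nested loop is adequate for a sketch; a genome-scale analysis would normally use an interval index instead.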

To further assess the relationship between SPIB occu- 
pancy and gene regulation we performed siRNA knock- 
down in OCI-LY3 and LY10 and assessed gene expression 
at 48 h (Figure 3B). SPIB knockdown was not as well sustained
in OCI-LY10 and we therefore restricted analysis
to OCI-LY3. Three hundred and six gene probes showed
significant differences in expression on SPIB knockdown
(False Discovery Rate (FDR) adjusted p-value < 0.05),
corresponding to 283 genes (Supplementary Table S3).
Of genes changing expression following SPIB knockdown,
71/133 downregulated (p-value = 1.43E-10) and 68/150 upregulated
(p-value = 1.5E-06) genes were linked to SPIB
occupancy. Imposing a threshold of >1.5-fold change in
expression restricted this to 88 genes changing expression,
among which 25/46 downregulated (p-value = 4.06E-05)
and 18/42 upregulated (p-value = 1.18E-02) genes were
linked with local SPIB occupancy. This set of acutely responsive
target genes, positively controlled by SPIB (Figure
3B), included the common elements of the ABC-DLBCL
profile CCND2 and NFKBIZ (1,14,24), as well as established
(SELL (Selectin-L/CD62L), CD40) and more recently
defined (FCRL2) regulators of B-cell immune responses
(53).

BATF is an IRF4 co-factor in ABC-DLBCL cell lines 

While genomic occupancy in the presence of SPIB repre- 
sented the predominant mode in both OCI-LY3 and LY10 
cells, sites occupied by IRF4 in the absence of SPIB and 
PU.1 were of particular interest as these were likely to
include regions bound by IRF4 in the context of additional
cofactors, such as BATF. This component of the
IRF4 cistrome included 839 peaks (13%) in OCI-LY3, but
a greater proportion in OCI-LY10, 5164 peaks (39%) (Figure
1B). Notably, de novo motif detection either from the
complete IRF4 cistrome from both ABC-DLBCL cell lines
or the IRF4_Only peak subsets identified motifs matching
AICEs (Figure 4A and Supplementary Figure S5A). These
included both AICE-1 variants, which place the AP1 component
of the motif 5' of the 'GAAA' sequence bound by
IRFs with a 4-base spacing, and the AICE-2 variant in
which the orientation is inverted and the core IRF site is
immediately 5' of the AP1 site.

We recently used a comparative analysis of gene expres- 
sion across 10 DLBCL data sets to establish the genes most 
consistently associated with ABC- and GCB-DLBCL (24). 
BATF was among the 24 genes that were associated with the 
ABC-class in all data sets. We therefore performed EMSA 
to assess the potential for BATF to form DNA-binding 
complexes with IRF4 in the OCI-LY3 and LY10 cell lines. 
Probes encompassing AICEs associated with SETBP1 and 
FOXO3 genes both generated a dominant complex which
was super-shifted by IRF4 or BATF antibodies (Figure 4B).
As expected, mutation of the AP1 element of the consensus
eliminated the formation of this co-complex. A greater
residual complex was observed on IRF4 antibody super- 
shift, particularly in OCI-LY3 cells, and expression of IRF8 
may provide an explanation for this observation as this fac- 
tor can also form complexes with BATF at AICEs (20,21). 

To assess the contribution of BATF to IRF4 binding 
more generally we performed ChIP-seq for BATF from
both ABC-DLBCL cell lines (Figure 5A). This identified
a total of 4735 and 10367 BATF peaks in OCI-LY3 and
LY10, respectively (Supplementary Table S4). These overlapped
with IRF4 peaks both in the absence and in the presence
of SPIB and PU.1 (Figure 5B), while a lesser fraction of
the BATF cistrome overlapped with SPIB or PU.1 in the absence
of IRF4. Among peak regions occupied by BATF in
both cell lines classical AP1 motifs and AICEs were identified
as enriched in de novo motif discovery (Figure 5C). We
manually validated a selection of 11 peak regions characterized
by different occupancy patterns for IRF4, SPIB and
BATF using ChIP-qPCR. These results verified the detection
of different occupancy patterns (Figure 6).

The position of IRF4 and BATF peak summits at co- 
occupied sites showed similar distributions (Supplementary 
Figure S5B). To evaluate motif usage at peak regions occupied
by IRF4 with SPIB, PU.1 and BATF (IRF4_SPIB-PU.1_BATF),
or BATF alone (IRF4_BATF) we considered
the 200 most significant peak regions. IRF4_BATF occupied
regions were characterized by the expected AP1 and
AICE motifs and rarely contained matches to either EICEs
or ETS motifs. In contrast, IRF4_SPIB-PU.1_BATF occupied
regions contained matches to EICE, ETS, AP1 and
AICE motifs but few individual peak regions contained 
matches to all four motifs (Supplementary Figure S5C). 
Overall within this subset of multiply co-occupied peaks a 
higher frequency of EICEs was observed relative to AICEs 
indicating that SPIB is likely to provide the most common 
direct IRF4 cofactor at these multiply co-occupied sites in 
OCI-LY3 and LY10 cells. 

Distinct IRF4 occupancy patterns relate to cofactor position- 
ing in Epstein-Barr virus (EBV) LCLs and ABC-DLBCL 
cell lines 

Signalling via the LMP1 and LMP2A proteins plays a criti- 
cal role in EBV lymphoblastoid transformation (54). These 
viral proteins provide mimics of constitutive CD40 and B-cell
receptor signalling, corresponding to two critical pathways
of oncogene activation in ABC-DLBCL (2). Lymphoblastoid
cell lines, and in particular the ENCODE data
derived from GM12878 LCLs (55), thus provide the opportunity
for relevant comparison to ABC-DLBCL. We therefore
assessed the IRF4, BATF and SPIB cistrome in OCI-LY3
and LY10 cells, selecting the dominant ETS factor for
balanced data set number, against the IRF4, BATF and
PU.1 cistromes in ENCODE GM12878 data (55). The distributions
of SPIB binding in the LY3 and LY10 cell lines
correlated most significantly with that of PU.1 in GM12878,
while IRF4 binding in LY3 and LY10 cell lines correlated
most highly with that of SPIB, and weakly with the distribution
of PU.1 in GM12878 (Figure 7A). In contrast, the
IRF4 distribution in GM12878 correlated with BATF and
to a lesser extent with PU.1 in the matching cell line (56).
Thus, the overall positioning of IRF4 in two related contexts
of post-germinal centre B-cell transformation shows
distinct linkage to either SPIB/PU.1 or BATF centred distributions.

Figure 3. Gene expression associated with SPIB occupancy. (A) Gene expression values for OCI-LY3 (left panel) and LY10 (right panel) are shown as
density plots, with the median and interquartile ranges (25th-75th) shown as a central line for all genes (Total) or genes associated with local factor
occupancy (peak ±2kb of TSS) as shown on the y-axis, with log2 expression values on the x-axis. (B) Summary of SPIB knockdown results, representative
western blot of SPIB knockdown in OCI-LY3 cells is shown on the left. The intersection of genes showing significant downregulation (adjusted p-value
<0.05, >1.5-fold change) following 48 h SPIB knockdown with those also showing local SPIB occupancy (5 kb upstream/intragenic) is illustrated in the
diagram on the right.
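
The genome-wide comparison of cistromes between the ABC-DLBCL cell lines and GM12878 by Pearson's correlation can be illustrated with a minimal Python sketch. How occupancy was summarized before correlation is not stated in this section, so the approach below (counting peak centres in fixed-width genomic bins, with a bin size chosen arbitrarily and only bins containing at least one peak included) is an assumption, and all peak positions are hypothetical.

```python
import math
from collections import Counter


def binned_counts(peaks, bin_size=1_000_000):
    """Count peak centres per (chrom, bin index) window."""
    return Counter((chrom, pos // bin_size) for chrom, pos in peaks)


def pearson(x, y):
    """Pearson's correlation coefficient; NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def occupancy_correlation(peaks_a, peaks_b, bin_size=1_000_000):
    """Correlate binned peak counts of two cistromes.

    Only bins containing a peak in at least one cistrome are used,
    which is a simplification made for this sketch.
    """
    counts_a = binned_counts(peaks_a, bin_size)
    counts_b = binned_counts(peaks_b, bin_size)
    bins = sorted(set(counts_a) | set(counts_b))
    return pearson([counts_a[b] for b in bins], [counts_b[b] for b in bins])


# Hypothetical peak centres, for illustration only.
cistrome_1 = [("chr1", 100), ("chr1", 200), ("chr1", 1_500_000), ("chr2", 50)]
cistrome_2 = [("chr1", 100), ("chr1", 1_500_000), ("chr1", 1_600_000),
              ("chr2", 50), ("chr2", 60)]

r_same = occupancy_correlation(cistrome_1, cistrome_1)
r_diff = occupancy_correlation(cistrome_1, cistrome_2)
print(r_same, r_diff)
```

Computing this coefficient for every pair of factors across the two cellular contexts gives a correlation matrix of the kind summarized in Figure 7A.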



SPIB is the functionally dominant IRF4 co-factor in the 
OCI-LY3 ABC-DLBCL cell line 

The overall frequency of IRF4 and SPIB co-occupancy, and 
the genome-wide correlation analysis supported a domi- 
nant role for SPIB in determining IRF4 occupancy in the 
ABC-DLBCL cell lines. However, the extent to which SPIB 
provided an essential determinant of IRF4 DNA-binding 
in this context was uncertain; to address this question, we
knocked down SPIB expression in OCI-LY3 cells, to a degree
sufficient to impact on both SPIB and IRF4 binding at
a selected positive control site (Figure 7B), and performed
ChIP-seq for SPIB, PU.1 and IRF4. Overall we observed
a highly significant loss of SPIB but not PU.1 occupancy
genome-wide, further validating the specificity of our SPIB
ChIP-seq data (Figure 7B and C). Although a subset of regulatory
elements did show a reciprocal increase in PU.1 occupancy
on SPIB depletion, such as a regulatory element
near CD28 shown in Figure 7D, generally PU.1 did not
compensate for SPIB depletion by increased binding (Figure
7C). In contrast, depletion of SPIB was accompanied
by a genome-wide loss of IRF4 occupancy. Even among
those sites bound by IRF4 in the absence of either ETS-factor
partner an overall loss of IRF4 occupancy was observed,
which contrasted with the absence of any impact on
IRF4 mRNA expression on SPIB knockdown (Figure 7C
IRF4_Only and Supplementary Table S3). However, using
the set of regulatory elements bound by IRF4 in the absence
of either PU.1 or SPIB for comparison (IRF4_Only), the
loss of IRF4 binding on SPIB knockdown was significantly
greater at sites co-occupied by IRF4 in the presence of
SPIB irrespective of PU.1 co-occupancy (IRF4_SPIB_Only,
p-value = 6.85E-28; IRF4_SPIB_PU.1, p-value = 5.47E-26).
Furthermore, although genome-wide occupancy by IRF4
was responsive to loss of SPIB, a small fraction (~8%)
of all IRF4 occupied sites was unaffected by SPIB depletion
(fold-change <1.4). This stable subset of IRF4 occupied
sites was significantly enriched for regulatory elements
bound by IRF4 in the absence of SPIB (43%, p-value =
1.23E-29). Thus, in the OCI-LY3 ABC-DLBCL cell line,
PU.1 fails to compensate acutely for SPIB depletion and a
general shift towards an alternate pattern of BATF-centred
IRF4 occupancy is not observed. However, while IRF4 occupancy
is globally responsive to SPIB knockdown, those
sites occupied by IRF4 in the absence of SPIB are as expected
most resilient. We conclude that in this context SPIB
provides the functionally dominant determinant of IRF4
genomic occupancy, and neither PU.1 nor BATF acutely
compensates to maintain IRF4 occupancy or drive redistribution
of IRF4 to a different occupancy pattern.

Figure 6. Differential occupancy patterns are confirmed by manual validation. Shown are representative results for BATF, SPIB and IRF4 ChIP using
qPCR at selected targets representative of IRF4_BATF, IRF4_SPIB and SPIB_Only occupancy patterns as indicated at the bottom of the figure. Results are
shown as fold enrichment relative to control IgG on the y-axis, promoter regions are indicated using official gene symbol, while other regulatory elements
are indicated by genomic position (hg19) of the forward primer. Results are representative of two independent experiments from different chromatin batches
derived from OCI-LY10 cells.

Figure 7. SPIB can act as a functionally dominant IRF4 partner in ABC-DLBCL. (A) IRF4 shows SPIB or BATF predominant genome-wide positioning
in different contexts of post-germinal centre transformation. The genome-wide distribution of BATF, IRF4 and SPIB in OCI-LY3 and LY10 was compared
to the distribution of BATF, IRF4 and PU.1 in ENCODE GM12878 data by Pearson's correlation. The pairwise correlations of occupancy determined
SPIB knockdown at 48 h was confirmed by western blot, and the degree of knockdown was confirmed by densitometry relative to ACTIN loading control,

SPIB occupancy is linked to genes overexpressed in primary
ABC-DLBCL with high SPIB and low BATF expression

The intensity of SPIB expression has been linked to genomic 
amplification or translocation of chr19 in ABC-DLBCL
(14), and SPIB shows a less consistent differential expres- 
sion between GCB- and ABC-DLBCL than BATF (24). We 
reasoned, therefore, that the relative expression of BATF 
and SPIB might contribute to heterogeneity of tumour bi- 
ology among ABC-DLBCL. 

Although the classification of DLBCL into cell of origin 
classes has provided a central framework for understand- 
ing this disease, a number of different algorithms have been 
used to implement the classification in different studies. We 
recently described a detailed evaluation of classifier algo- 
rithms across a range of available DLBCL gene expression 
data sets (24). We established a robust platform independent 
classifier tool, the DLBCL Automatic Classifier (DAC), 
which allows consistent classification of multiple data sets 
and is effective on data generated both from fresh frozen 
and formalin-fixed paraffin embedded samples. With this 
tool we previously performed a meta-analysis of gene ex- 
pression across 10 uniformly classified DLBCL gene expres- 
sion data sets (24). To address whether the relative expres- 
sion of BATF and SPIB might contribute to disease hetero- 
geneity, we first examined the pairwise correlation of these 
transcription factors in ABC-DLBCL using the cases contained
in the 10 publicly available data sets and the classifications
of these cases that we have previously established
using DAC (24). In this analysis, one data set (GSE19246)
emerged as a consistent outlier and was therefore excluded
from further assessment. Each transcription factor pairing
showed evidence of a modest positive correlation; however,
overall there was a greater positive correlation of IRF4
and BATF (average Spearman's correlation = 0.53 ± 0.08)
than of SPIB and BATF (average Spearman's correlation =
0.40 ± 0.09) or IRF4 and SPIB (average Spearman's
correlation = 0.42 ± 0.09) (Supplementary Figure S6).
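
The per-data-set correlation summary described above can be sketched in plain Python. This is only an illustration: it assumes that the ± value is a standard deviation across data sets, and the function names and toy expression values are hypothetical rather than taken from the study.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def summarize(datasets, gene_a, gene_b):
    """Mean and standard deviation of per-data-set Spearman correlations."""
    rhos = [spearman(d[gene_a], d[gene_b]) for d in datasets.values()]
    return mean(rhos), stdev(rhos)


# Hypothetical toy input: {data set: {gene: expression per ABC-DLBCL case}}.
toy = {
    "set1": {"IRF4": [1.0, 2.0, 3.0, 4.0], "BATF": [1.5, 1.0, 3.5, 4.0]},
    "set2": {"IRF4": [0.2, 0.9, 0.4, 0.7], "BATF": [0.1, 0.8, 0.5, 0.3]},
}
mean_rho, sd_rho = summarize(toy, "IRF4", "BATF")
```

Computing the correlation within each data set before averaging avoids pooling expression values that were generated on different platforms.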

We reasoned that since SPIB and BATF provide distinct 
regulatory information, the observed variability in BATF 
and SPIB mRNA expression in ABC-DLBCL might be 
associated with differences in disease biology. To address 
this we separated the ABC-DLBCL cases into four groups, 
by using a contingency table approach divided by high 
and/or low expression of BATF and SPIB mRNA (top and 
bottom 50% of expression as threshold). We then determined
differential gene expression (p-value < 0.05) between
ABC-DLBCL cases characterized by the two extremes of
high-SPIB and low-BATF versus high-BATF and low-SPIB
mRNA expression in each data set. We subsequently refer
to these subgroups as SPIB high /BATF low -ABC-DLBCL
and SPIB low /BATF high -ABC-DLBCL. To identify genes 
consistently associated with either of these two extremes we 
used a threshold of differential expression in four or more 
data sets, and refer to the resulting lists as 'meta-profiles'. 
In this pairwise comparison a total of 198 genes were iden- 
tified as overexpressed in SPIB high /BATF low -ABC-DLBCL 
and 237 genes in SPIB low /BATF high -ABC-DLBCL (Sup- 
plementary Table S5). 
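
The grouping and meta-profile steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the median is used as the 50% threshold, cases exactly at the median are assigned to the low group, and the function names, labels and inputs are hypothetical. The differential expression test itself is not shown.

```python
from collections import Counter
from statistics import median


def split_groups(cases):
    """Assign each case to one of four SPIB/BATF groups by median split.

    cases: {case_id: (spib_expression, batf_expression)}
    Assumption: values equal to the median are treated as 'low'.
    """
    spib_cut = median(s for s, _ in cases.values())
    batf_cut = median(b for _, b in cases.values())
    groups = {}
    for case_id, (spib, batf) in cases.items():
        spib_level = "high" if spib > spib_cut else "low"
        batf_level = "high" if batf > batf_cut else "low"
        groups[case_id] = f"SPIB_{spib_level}/BATF_{batf_level}"
    return groups


def meta_profile(per_dataset_hits, min_datasets=4):
    """Keep genes called differentially expressed in at least min_datasets.

    per_dataset_hits: one set of gene names per data set.
    """
    counts = Counter(gene for hits in per_dataset_hits for gene in hits)
    return {gene for gene, n in counts.items() if n >= min_datasets}


# Hypothetical toy cases and per-data-set gene calls.
toy_cases = {"a": (1, 9), "b": (2, 8), "c": (8, 2), "d": (9, 1)}
toy_groups = split_groups(toy_cases)
toy_profile = meta_profile([{"X", "Y"}, {"X"}, {"X", "Z"}, {"X", "Y"}, {"Y"}])
```

Only the two discordant groups (high/low and low/high) enter the pairwise comparison; the concordant groups are set aside at this step.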

As an approximate assessment of the relationship between
SPIB, BATF and IRF4 expression in OCI-LY3 and
LY10 cell lines and primary ABC-DLBCLs we superimposed
the normalized expression values for these cell lines
onto the distributions derived for primary ABC-DLBCL
across all data sets. With the caveat that gene expression
assessments from primary tumour samples derive from mixed
cell types, this confirmed that OCI-LY3 and LY10 fell within
the general distribution of expression values for BATF and
IRF4, and at the high end of the SPIB distribution (Figure
8A). We then assessed the overlap between local SPIB
and BATF occupancy in OCI-LY3 and LY10 cells and the
meta-profiles of SPIB high /BATF low and SPIB low /BATF high -ABC-DLBCL
(Figure 8B). Genes with occupancy by SPIB
or BATF, within the gene body or 5 kb upstream, were
significantly enriched among both SPIB high /BATF low and
SPIB low /BATF high -ABC-DLBCL meta-profiles. However,
SPIB occupancy showed a substantially more significant
enrichment in the SPIB high /BATF low meta-profile (p-value
= 3.43E-23), than the SPIB low /BATF high meta-profile
(p-value = 4.47E-11). In contrast, BATF occupancy showed
only a minor difference in enrichment between the two
meta-profiles (SPIB high /BATF low meta-profile p-value =
2.62E-09 versus SPIB low /BATF high meta-profile p-value
= 9.77E-08). This supports a direct regulatory contribution
by SPIB to preferential gene expression in primary
SPIB high /BATF low -ABC-DLBCL.
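
The enrichment of occupied genes within a meta-profile can be illustrated with a one-sided hypergeometric test, the test named in the legend of Figure 8B. The sketch below is written in plain Python; the function name and the example counts are hypothetical and do not reproduce the reported p-values.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes with local occupancy (e.g. by SPIB)
    sample:     genes in the meta-profile
    overlap:    meta-profile genes that also show occupancy
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_example = hypergeom_enrichment_p(population=20000, annotated=5000,
                                   sample=200, overlap=90)
```

Exact integer arithmetic with `math.comb` keeps very small tail probabilities accurate until the final division.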

SPIB high /BATF low and SPIB low /BATF high -ABC-DLBCL
are reciprocally linked to distinct stages of B-cell differentiation

The biology of ABC-DLBCL is related to cells trapped 
in abortive plasma cell differentiation. We noted that dur- 
ing in vitro B-cell differentiation to plasma cells, BATF 
expression was induced in activated B-cells prior to the 
loss of B-cell phenotype, while SPIB is modestly reduced 
in activated B-cells and repressed upon transition to plas- 
mablasts (Supplementary Figure S7). We, therefore, con- 
sidered that the differences in gene expression linked to 
the subgroups of ABC-DLBCL defined by relative SPIB 
and BATF expression might also relate to different stages 
of B-cell to plasma cell differentiation. To address this 
we intersected the meta-profiles for SPIB high /BATF low 
and SPIB low /BATF high -ABC-DLBCL with gene expression
data, derived from an in vitro model we have recently developed,
spanning the in vitro differentiation of resting human
B-cells to long-lived plasma cells (28). Notably, both
meta-profiles were significantly enriched for genes showing
dynamic expression changes during B-cell to plasma
cell differentiation (54/193 SPIB high /BATF low p-value =
2.14E-11, 62/235 SPIB low /BATF high p-value = 1.15E-11).
Furthermore, the meta-profiles were also skewed relative
to the B-cell differentiation time course (Figure 8C and
D). SPIB high /BATF low -ABC-DLBCL was positively associated
with genes with maximal expression in B-cells (Z-score
= +6.1, p-value = 0) and significantly depleted of
genes expressed at later stages of differentiation, in particular
the in vitro activated B-cell state (AB genes Z-score =
-3.7, p-value = 2.4E-06). In contrast, SPIB low /BATF high -ABC-DLBCL
showed a reciprocal pattern of association
with significant enrichment of in vitro activated B-cell
genes (Z-score = 2.97, p-value = 0.003) and significant
depletion of genes expressed in resting B-cells (Z-score
= -3.33, p-value = 7.05E-05). However, there was no
significant difference, between the SPIB high /BATF low and
SPIB low /BATF high -ABC-DLBCL subgroups, in the expression
of classifier genes used to establish the ABC versus
GCB-DLBCL classification or the DAC-classification confidence,
which provides an overall assessment for each case
of the likelihood of belonging to one of the classes of the cell
of origin classification (Supplementary Table S6). We conclude
that these two subgroups show similar expression of
the principal classifier genes used to identify ABC-DLBCL
but differ in their relationship to stages of B-cell differentiation:
the SPIB high /BATF low -ABC-DLBCL subgroup is
characterized by preferential retention of genes expressed in
resting B-cells, while SPIB low /BATF high -ABC-DLBCL displays
a more exaggerated similarity to in vitro activated B-cells.

Figure 8. SPIB high /BATF low and SPIB low /BATF high -ABC-DLBCL are differentially associated with transcription factor occupancy and stages of B-cell
differentiation. (A) The normalized expression values for SPIB, BATF and IRF4 mRNA in OCI-LY3 and LY10 cells are shown relative to expression of
these factors in primary ABC-DLBCL across 9 gene expression data sets. The scatter plots illustrate pairwise comparisons as indicated by arrow labels,
x- and y-axis represent normalized expression values as z-scores. The correlation is indicated as a line with 95% confidence interval as shading, and the
expression values for OCI-LY3 and OCI-LY10 are shown as red and blue spots, respectively. (B) The enrichment of genes (hypergeometric test) with local
SPIB and BATF occupancy (-5 kb or intragenic) in OCI-LY3 and LY10 is shown, as indicated on the left of the bar graph, among the meta-profiles for
SPIB high /BATF low (orange bars) and SPIB low /BATF high -ABC-DLBCL (yellow bars). The bar graph illustrates the Log10 p-value to the left and Z-score
to the right on the x-axis. (C) The enrichment or depletion of genes maximally expressed at different stages of in vitro human B-cell differentiation to
the plasma cell stage, among SPIB high /BATF low (orange bars) and SPIB low /BATF high -ABC-DLBCL (yellow bars) meta-profiles was determined using a
plasmablast (PB) and plasma cell (PC) as indicated from left to right. The dotted line represents p-value of 0.05. (D) The SPIB high /BATF low (orange bar) and
SPIB low /BATF high -ABC-DLBCL (yellow bar) meta-profiles are shown as Wordles with the degree and consistency of differential expression represented
by font size, also indicated is the enrichment of meta-profile genes among genes showing dynamic expression during B-cell terminal differentiation in the

SPIB high /BATF low and SPIB low /BATF high -ABC-DLBCL
are linked to distinct gene sets

In order to gain further insight into the potential re- 
lationships of SPIB high /BATF low and SPIB low /BATF high - 
ABC-DLBCL and previously defined molecular path- 
ways, we performed an analysis of gene signature en- 
richment using a hypergeometric test. The extensive com- 
pendium of gene signatures tested were derived from Gen- 
eSigDB, MSigDB, Staudt, Shipp and Du laboratories, 
and were filtered for gene signatures of fewer than 1000
genes in size (13978 in total) (34,57-60). After correction
for false discovery there remained an extensive list
of enriched signatures for both meta-profiles. At FDR
corrected p-value < 0.001, 109 and 281 gene signatures
overlapped significantly with the SPIB high /BATF low and
the SPIB low /BATF high meta-profiles, respectively (Supplementary
Table S7). The SPIB high /BATF low meta-profile
showed most significant overlap with signatures of the
resting B-cell state (e.g. Pan_B_U133plus, FDR p-value
= 2.37E-15, Blood_Module-1.3_B_cells, FDR p-value =
1.72E-11), but also showed enrichment of signatures related
to the activated B-cell lymphoma class (e.g. ABCgtGCB_U133AB,
FDR p-value = 1.24E-09) as well as several
signatures related to plasmacytoid dendritic cells (e.g.
Dendritic_cell_CD123pos_blood, FDR p-value = 2.21E-10;
GSE29618_PDC_VS_MDC_UP, FDR p-value = 4.13E-10).
In relation to SPIB itself, the SPIB high /BATF low meta-profile
was notably enriched both for an external signature
related to evolutionarily conserved PU.1/SPIB motifs
in gene promoters (MSigDB_C3 signature RGAGGAARY_V$PU1_Q6,
FDR p-value = 9.42E-06) and
importantly also for expression of genes in chr19q13
(MSigDB_C1 signature chr19q13, FDR p-value = 8.38E-05),
the cytoband containing the SPIB gene, while no other
cytoband showed enrichment at FDR corrected p-value <
0.001 for either meta-profile. Interestingly, when considering
genes 2 Mb either side of the SPIB transcriptional
start site, enrichment was exclusively observed for genes
upstream/centromeric to SPIB (Supplementary Table S7).
Coexpression of genes associated with chr19 amplification,
as described by Lenz et al. (14), in the SPIB high /BATF low
meta-profile suggests that such amplification is a contributing
factor to the pathogenesis of this subgroup identifiable
from expression profiles.
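
The text does not state which false discovery correction was applied to the signature p-values. As one common possibility, the Benjamini-Hochberg procedure is sketched below; treating it as the method used here is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four gene signatures.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
# Indices of signatures passing an FDR corrected threshold of 0.03.
passing = [i for i, q in enumerate(fdr) if q < 0.03]
```

With 13978 signatures tested against each meta-profile, a correction of this kind is what makes the threshold of FDR corrected p-value < 0.001 interpretable.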

In contrast, the SPIB low /BATF high meta-profile 
was most significantly enriched for genes asso- 
ciated with STAT3 activation in ABC-DLBCL 
(STAT3high_ABC_DLBCL_subgroup, FDR p-value =
2.85E-17), while among other signaling pathway signatures
those linked to nuclear factor kappa-light-chain-enhancer
of activated B cells (NFkB) in ABC-DLBCL were also
significantly enriched (e.g. NFKB_UP_BCR_paper, FDR
p-value = 2.22E-08; BASSO_CD40_SIGNALING_UP,
FDR p-value = 6E-05) (Supplementary Table S7). This
is also consistent with the enrichment of genes at the
activated B-cell stage of in vitro differentiation (Figure 8D),
since this is driven by addition of IL21, a potent STAT3
activator, and CD40 ligation. However, it is worth noting
that the SPIB low /BATF high meta-profile does not overlap
with all STAT3 related signatures to a similar degree. Genes
included in a recent expression-based signature of STAT3
activation, in DLBCL as a whole (61), showed no significant
enrichment in the SPIB low /BATF high meta-profile
(HUANG_PY_STAT3_Total, overlap 2/32, FDR p-value
= 0.2, or HUANG_PY_STAT3_11Sig, overlap 0/11 genes,
FDR p-value = 0.95). We conclude that separating the
ABC-DLBCL subset according to relative expression of 
the two IRF4 cofactors SPIB and BATF identifies sub- 
groups differentially linked to previously defined features 
of DLBCL tumour biology. 

High SPIB expression is linked to a group of ABC-DLBCL 
with good clinical outcome 

We noted that the SPIB high /BATF low -ABC-DLBCL sub- 
group in our U.K. population-based patient cohort 
of DLBCL treated with R-CHOP chemotherapy (62), 
(GSE32918), was characterized by a relatively good sur- 
vival and included a subgroup of patients with survival be- 
yond 5 years (Figure 9A). Since case numbers were limit- 
ing we next asked whether a similar association between 
survival and differential SPIB and BATF expression could 
be observed in other data sets of cases treated with similar
immunochemotherapy. We first examined this association
in the data set GSE10846, generated by the Lymphoma
Leukemia Molecular Profiling Project (LLMPP),
which is the largest available data set derived from fresh 
frozen, rather than formalin-fixed paraffin embedded ma- 
terial, and has provided a reference data set for the asso- 
ciation between gene expression profiles and clinical out- 
come in DLBCL in the era of immunochemotherapy (29). 
In this data set SPIB high /BATF low -ABC-DLBCL was simi- 
larly characterized by a good outcome (Figure 9B). 
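
The survival curves referred to here are Kaplan-Meier estimates. A minimal sketch of the estimator for a single group is given below in plain Python; the function name and the follow-up data are hypothetical, and the statistical comparison between subgroups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group.

    times:  follow-up time for each patient
    events: True if death was observed, False if the patient was censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up in years for five patients.
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [True, False, True, False, True])
```

Each subgroup, for example SPIB high /BATF low and SPIB low /BATF high, would be passed through the estimator separately to obtain the two curves being compared.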

Since the choice of algorithm used to implement the cell 
of origin classification does affect the classification of a 
subset of cases with marginal expression values for classi- 
fier genes (24), we also addressed whether the association 
of outcome with the differential SPIB/BATF expression
was affected by the use of our implementation of the cell
of origin classifier. However, this was not the case since a
similar good outcome was observed for SPIB high /BATF low -ABC-DLBCL
cases when using the pre-assigned classes for
GSE10846 in the Gene Expression Omnibus (Supplementary
Figure S8). This was consistent with the fact that the
assignments of DLBCL to cell of origin classes differ for
only a minority of cases overall between our classifications
and those previously assigned to cases in GSE10846 (24).

Figure 9. SPIB and BATF expression is linked to outcome and MYD88 mutation status in ABC-DLBCL. (A) Shows Kaplan-Meier analysis for overall
survival of SPIB high /BATF low (blue line), compared to SPIB low /BATF high -ABC-DLBCL (red line) cases in data set GSE32918. (B) Displays the Kaplan-Meier
analysis for overall survival of SPIB high /BATF low (blue line), compared to SPIB low /BATF high -ABC-DLBCL (red line) cases in data set GSE10846 R-CHOP
component, using cases classified as ABC-DLBCL with our implementation of the cell of origin classification, DAC. (C) Illustrates the enrichment
of MYD88-L265P mutations among the four ABC-DLBCL subgroups defined by high/low SPIB and high/low BATF mRNA expression. The y-axis
represents the log2 of the p-value of enrichment. Significant enrichment is only observed in the SPIB high /BATF low subgroup indicated by a star. (D)
Illustrates Kaplan-Meier analysis of overall survival for ABC-DLBCL cases in data set GSE32918, divided according to MYD88 mutation status, wild
type (black line), MYD88-L265P mutation and SPIB high /BATF low expression profile, or all other MYD88-L265P mutated ABC-DLBCL cases (red line).
MYD88 mutations other than L265P, detected by Sanger sequencing, of which there were three cases, were included in the 'wild type' category to reflect a
clinical scenario of targeted MYD88-L265P mutation detection.

In contrast to the concordant results observed be- 
tween the LLMPP data, GSE10846 (29), and our U.K. 
population-based data, GSE32918 (62), when we examined 
a separate data set of R-CHOP treated DLBCL cases generated
by The International DLBCL Rituximab-CHOP Consortium
Program (GSE31312) (37), we found no significant
difference in survival between SPIB high /BATF low and
SPIB low /BATF high -ABC-DLBCL. Nonetheless, the concordant
results observed in GSE10846 and GSE32918 indicate
that high SPIB and low BATF expression can identify
a subgroup of ABC-DLBCL cases with good outcome,
which in the context of our U.K. population-based cohort
includes a small subset of ABC-DLBCL cases treated with
R-CHOP displaying 5-year or greater survival.

SPIB high /BATF low -ABC-DLBCL is linked to MYD88 mutation
status and the combination identifies a group with distinct
favourable outcome

MYD88 mutation is an oncogenic event strongly associated
with ABC-DLBCL and Waldenström macroglobulinemia;
the recurrent L265P mutation of MYD88 accounts
for the majority of this association (63,64). MYD88 is a
principal signal transduction component downstream of
TLRs and IL1R. A functional linkage between SPIB expression
and TLR/MYD88 pathway activation has been
identified by Yang et al., in which SPIB represses autocrine
IFN secretion allowing ABC-DLBCL survival in the context
of MYD88 mutation (16). At the same time, while the
presence of a MYD88 mutation may promote receptor-independent
signalling (63), TLR activation also contributes
to signal transduction in the context of MYD88-L265P
in ABC-DLBCL and in B-cells engineered to express
mutant MYD88 (65,66). The MYD88-L265P mutation is
present in both OCI-LY3 and LY10 cells (63), and we noted
that SPIB binding was present in the promoters of TLR4,
7 and 9 and in the vicinity of MYD88 itself in these cells.
We, therefore, asked whether an association between SPIB
expression and MYD88 mutation might also be observed
in primary ABC-DLBCL. We examined the MYD88 mutation
status of ABC-DLBCL cases in our cohort (62), and
found a statistically significant association between cases
with high SPIB and low BATF expression and the presence
of a MYD88 mutation in general (7/9 cases, p-value
= 0.015), or the common MYD88-L265P mutation in particular
(6/9 cases, p-value = 0.03) (Figure 9C and Supplementary
Table S8). In contrast, there was no significant association
between MYD88 mutation status and any of the
other combinations of SPIB and BATF expression, among
which MYD88 mutations were randomly distributed. Notably,
among the six SPIB high /BATF low -ABC-DLBCL patients
with MYD88 mutations all but one survived for 5
years or more; the patient who died during follow-up was
an 85-year-old who survived for 2.5 years but had not received
treatment with curative intent. Thus, the identification
of SPIB high /BATF low mRNA expression may provide
a tool to identify a subset of ABC-DLBCL patients with
MYD88-L265P mutation who have a good response to current
immunochemotherapy.

DISCUSSION 

IRF4 is at the centre of both the transcriptional program of 
B-cell terminal differentiation and of ABC-DLBCL. IRF4 
engages in cooperative interactions with different transcrip- 
tion factor partners at distinct DNA elements (5,67), pro- 
viding the basis for varied transcriptional input across cell 
states. Our findings suggest that the balance of IRF4 part- 
ner expression between SPIB and BATF identifies distinct 
subgroups of ABC-DLBCL linked to different stages of B- 
cell differentiation, oncogenic pathway activation and clin- 
ical outcome. 

While several potential IRF4 transcription factor part- 
ners have been described (5), recent genome-wide studies 
have so far identified three predominant modes of IRF4 
DNA-binding: at EICEs (16), at AICEs (18-20) and at
repeats of the IRF 'GAAA' core consensus matching the
ISRE pattern (18,68). A shift in favour of IRF4 binding at
ISRE-like sequences has recently been identified as a transition
point during plasma cell differentiation (69). These
modes of DNA-binding by IRF4 are not mutually exclusive.
In the OCI-LY3 and LY10 ABC-DLBCL cell lines, motif
enrichment provided evidence for all three patterns of
occupancy with EICEs predominating over co-occupancy
with BATF at AICEs or occupancy at ISRE containing sequences.
In the myeloma cell line H929, IRF4 occupancy
in the context of AICEs was not observed, which is most
likely to reflect absent or low BATF expression. While PU.1
expression in H929 cells resulted in frequent occupancy of
IRF4 at EICEs, nonetheless IRF4 occupancy in the absence
of PU.1 at regulatory elements characterized by ISRE or
simple 'GAAA' core elements was most frequent. Of note,
PU.1 expression is not uniform across primary myelomas or
myeloma cell lines, suggesting the potential for significant 
variation in transcriptional input from IRF4 in plasma cell 
malignancies, which will be important to explore in future. 

Frequent IRF4 occupancy at regulatory elements containing
EICEs was recently described in the HBL1 ABC-DLBCL
cell line by Yang et al. (16). This study additionally
examined SPIB occupancy, and thus identified a central
role for IRF4-SPIB heterodimers in several aspects of ABC-DLBCL
biology. However, the use of HBL1 cells engineered
to express biotin-tagged SPIB meant that the relative contribution
of endogenous SPIB or PU.1 to IRF4 occupancy
was not established. Here, we have addressed this issue and
our data put the role of IRF4-SPIB heterodimers in ABC-DLBCL
in a new context. In cell lines with strong SPIB
expression this association is both quantitatively and functionally
dominant, as shown by the fact that SPIB knockdown
in the OCI-LY3 cell line leads to extensive loss of
IRF4 DNA-binding, without compensation by PU.1 or
redistribution of IRF4 to a more BATF-centred distribution.
However, our data also show that input from SPIB is mod- 
ified by a significant contribution from BATF both in the 
context of combined occupancy of regulatory elements by 
BATF, IRF4 and SPIB and as an independent IRF4 cofac- 
tor. 

Comparison to ENCODE data demonstrates that the
pattern of IRF4 occupancy in ABC-DLBCLs differs significantly
from that in the EBV lymphoblastoid cell line
GM12878. In the latter cell line a dominant contribution
is made by BATF to IRF4 occupancy and BATF emerges
as more highly correlated with IRF4 than PU.1 genome-wide
(56). Thus, the GM12878 LCL provides a contrasting
instance in which BATF is the predominant IRF4 partner
in a transformed post-germinal centre B-cell. Interestingly,
EBV-derived transcription factors expressed in LCLs have
been recently shown to extensively co-occupy regulatory elements
bound by BATF, IRF4 and PU.1 (70,71), and BATF
has been previously identified as a direct target induced
by EBNA2 (72). It will therefore be important to establish
whether a BATF-centred IRF4 distribution can occur
during B-cell activation or in primary SPIB low /BATF high -ABC-DLBCLs,
or whether this pattern of occupancy observed
in GM12878 cells reflects a regulatory state specific
to EBV transformation. If so, this would be predicted to
pertain as the predominant mode of IRF4 occupancy in
EBV-driven B-cell malignancies, such as EBV-associated
DLBCL and EBV-associated classical Hodgkin lymphoma.
Together the data are consistent with a general model of
context-dependent IRF4 activity, and indicate that IRF4
expression in post-germinal centre B-cell malignancies can
be linked to several quite distinct transcriptional states.

It is inevitable that a disease category, such as ABC-DLBCL,
encompasses heterogeneity, but this is particularly
relevant where one of the principal classifier genes
used to establish the category encodes a transcription factor
that can display the wide range of cis-regulatory occupancy
observed for IRF4. The interest of analysing the
nature of this heterogeneity lies on the one hand in identifying
significant differences in clinical outcome and on
the other in what such heterogeneity indicates in relation
to disease biology. The biological validity of subdividing
ABC-DLBCL based on SPIB and BATF expression is supported
by the reciprocal association of the resulting subgroups
with genes linked to distinct stages of B-cell differentiation.
That SPIB high /BATF low -ABC-DLBCL is more
significantly associated with genes expressed in resting B-cells
is generally consistent with what is known of the normal
function and expression pattern of SPIB, which has
previously been identified as a regulator of B-cell signaling
pathways and a repressor of plasma cell differentiation
(10-13). Furthermore, the significant enrichment of genes
on chr19 in the vicinity of SPIB, including genes previously
identified as coordinately overexpressed in ABC-DLBCLs
with amplification of chr19 by Lenz et al. (14), is consistent
with chr19 amplification providing a pathogenetic mechanism
in the SPIB high /BATF low -ABC-DLBCL subgroup. In
contrast, the enrichment in the SPIB low /BATF high -ABC-DLBCL
meta-profile of genes expressed in B-cells following
IL21 and CD40L activation, STAT3-high ABC-DLBCL
and NFkB and CD40 activation, points to combined
activation of the STAT3 and CD40/NFkB pathways
as likely mechanisms driving this subgroup. That
SPIB low /BATF high -ABC-DLBCL does not simply reflect a
surrogate for STAT3 activation alone is indicated by the
lack of enrichment of two signatures recently described
as predictors of STAT3 activation in DLBCL as a whole
(61). BATF has been previously identified as a target of the
NFkB and STAT3 pathways in other cell systems (73-76),
while IRF4 is seen as a principal target of NFkB activity in
B-cell differentiation (77). In T-cells BATF and IRF4 cooperate
with STAT3 acting potentially as pioneer factors (18).
These transcription factors are thus likely to provide important
hubs for signal integration both in post-germinal centre
B-cell neoplasia and at the initiation of B-cell terminal
differentiation.

From the point of view of clinical significance we have
shown here that high expression of SPIB and low expression
of BATF mRNA can identify a good prognostic
group of DLBCL when treated with currently standard
immunochemotherapy, R-CHOP. While this association could
be observed in two of the existing data sets of R-CHOP
treated DLBCL including the largest data set generated
from fresh-frozen samples by the LLMPP (GSE10846)
(29), and our population based cohort (GSE32918), it was
not reproduced in the data set generated by The International
DLBCL Rituximab-CHOP Consortium Program
(GSE31312) (37). The latter includes the largest number of
R-CHOP treated DLBCL cases analysed by gene expression
profiling to date, and was generated on the same platform
as the LLMPP data set, GSE10846, but derives from
formalin-fixed paraffin embedded rather than fresh frozen
samples. The reason why a good outcome group of ABC-DLBCL
could not be identified from relative SPIB and
BATF mRNA expression in GSE31312 could not be ascertained
from the gene expression data, but it may reflect underlying
differences in case selection. In this regard, it is notable
that our data set GSE32918 is unique in representing
the general population of DLBCL from a single geographically
defined area (62), while other data sets derive from
multi-institutional research consortia.

That SPIB high /BATF low -ABC-DLBCL is significantly
associated with mutation of MYD88 is consistent with
the model recently proposed by Yang et al. of a role for
SPIB/IRF4 heterodimers in repressing autocrine IFN secretion
that limits ABC-DLBCL survival (16). In this regard,
a striking feature of our results is the finding that
in our population-based patient cohort SPIB high /BATF low -ABC-DLBCLs
with a MYD88-L265P mutation, detectable
by Sanger sequencing, form a distinct group of ABC-DLBCL
with a favourable outcome on current therapy. Indeed,
the outcome of this small subgroup might be considered
to represent 'cure'. Since MYD88-L265P mutation status
and mRNA expression levels are readily determined in a
clinical setting it will be important to extend these observations
in future, and evaluate whether this combination can
be used to prospectively identify a subset of good risk ABC-DLBCL
cases. However, we stress that given the small patient
number in the retrospective analysis presented here, the
result is at present only suggestive.

The significant association of SPIB expression with
MYD88 mutation status may also link to recent data
demonstrating that MYD88-L265P remains dependent on
TLRs in order to manifest its oncogenic potential (65). Further
to this, in a recent elegant study the impact of MYD88-L265P
has been examined in murine models, demonstrating
that the oncogenic potential of MYD88-L265P is also constrained
by feedback mechanisms restricting NFkB pathway
activation and by induction of apoptosis (66). SPIB
is part of the core transcriptional network of plasmacytoid
dendritic cells (48,78,79), which are particularly specialized
for cellular responses following TLR ligation (80). Consistent
with this, extensive occupancy of SPIB in the promoters
and immediate vicinity of TLR4, TLR7 and TLR9 as
well as MYD88 itself was evident from ChIP-seq data in
OCI-LY3 and LY10 cells. While significant effects on TLR
and MYD88 mRNA expression were not observed on SPIB
knockdown, this could be explained by the transient nature
of SPIB knockdown in our experiments, and the potential
for redundant regulatory input from other transcription
factors. A particularly intriguing possibility is suggested by
the preferential expression of TCF4 (also known as E2-2)
in the SPIB high /BATF low -ABC-DLBCL subgroup, and the
presence of several SPIB binding peaks across the TCF4
(E2-2) locus in OCI-LY3 and LY10 cells. Given the importance
of E2-2 and SPIB in plasmacytoid dendritic cell development
(48,79), these observations suggest that cooperation
between these factors may contribute to the biology
of the SPIB high /BATF low -ABC-DLBCL subgroup.

In conclusion, the data presented here define the relationship
of IRF4 to its endogenous partners in ABC-DLBCL
cell lines, identifying BATF as a principal IRF4 partner in
addition to SPIB in these models of lymphoma. Our data
also indicate that a predominant input from SPIB correlates
with a specific subgroup of primary ABC-DLBCL significantly
associated with MYD88 mutation and a better prognosis
in the context of currently standard therapy. These
data support a model in which overriding input from SPIB
is not a unifying feature of ABC-DLBCL, but instead contributes
to heterogeneity in this subset. Our findings identify
disease heterogeneity in ABC-DLBCL intimately associated
with the gene regulatory network controlling the initiation
of plasma cell differentiation and the activated B-cell
program.